- 
                Notifications
    You must be signed in to change notification settings 
- Fork 5.1k
[Storage] [WebJobs] LogScan to properly use the target BlobServiceClient when specified #53464
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[Storage] [WebJobs] LogScan to properly use the target BlobServiceClient when specified #53464
Conversation
…y.RegisterAsync; Added catch for possible permissions failure to default to primary
| It would be great if this could be merged and released soon! We have also opened a case with Azure Support about this (#2510070050000897). Trace that show restartExceptiont that is logged just before this traceStacktrace for this exceptionSee support case #2510070050000897. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some initial feedback
| @@ -11,7 +11,11 @@ namespace Microsoft.Azure.WebJobs.Extensions.Storage.Blobs.Listeners | |||
| { | |||
| internal interface IBlobNotificationStrategy : ITaskSeriesCommand, IBlobWrittenWatcher | |||
| { | |||
| Task RegisterAsync(BlobServiceClient blobServiceClient, BlobContainerClient container, ITriggerExecutor<BlobTriggerExecutorContext> triggerExecutor, | |||
| Task RegisterAsync( | |||
| BlobServiceClient primaryBlobServiceClient, | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should be passing this "primary" account down anymore. It's not needed. There are only 3 derived strategies and:
- PollLogsStrategy: Should be pointing to the $logs of the TARGET account, not host account, which is the bug you're fixing. Doesn't need host account.
- ScanBlobScanLogHybridPollingStrategy: creates a PollLogsStrategy internally, so the above applies. Additionally, it's using a BlobScanInfoManager instance which itself is pointing at the host storage account already which is correct - we want to store these scan breadcrumbs in host storage not target storage. However the AccountName being passed into IBlobScanInfoManager.LoadLatestScanAsync is currently the host storage account name which isn't correct I think. It should be the target storage account. @brettsam please confirm.
- ScanContainersStrategy - doesn't even use the BlobServiceClient param
So all of these strategies only need the target storage client and container.
| { | ||
| BlobLogListener logListener = await BlobLogListener.CreateAsync(blobServiceClient, _logger, cancellationToken).ConfigureAwait(false); | ||
| _logListeners.Add(blobServiceClient, logListener); | ||
| // If no target client is specified, use the primary. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see why we need this special handling for if a target isn't specified. We're already taking a BlobContainerClient that is coming from the "target" account. Ultimately that's coming from when the trigger binding was created here. The "data" client is the "target" client. I think we can simplify here, particularly based on my other comment not to even pass the "primary" account down.
| { | ||
| Logger.LoggingNotEnabledOnTargetAccount(_logger, targetBlobServiceClient.Uri.AbsoluteUri); | ||
|  | ||
| // Fallback to primary client if target client does not have the permissions to be used. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get that you're trying to be defensive here with this fallback, but we're falling back to incorrect behavior that doesn't address the bug.
I'm thinking instead, the EnableLoggingAsync should be a TryEnableLoggingAsync method that handles the permissions error and logs an error that the customer will see. The error should indicate that logging needs to be enabled on the account, we tried to do that automatically but couldn't, etc. Resolution will be for customers to do this themselves.
Ideally we'd allow this to be fatal just as it is now targeting the primary storage account, but there is a high chance for regressions here since customers may be targeting secondary accounts without the right permissions. So I think the safest course of action will be to log the message.
Note that this change will be no worse than current behavior. I.e. for customers targeting a secondary account where they don't have permissions for us to enable logging, we're already not triggering on their logs.
| @@ -256,5 +278,22 @@ private void ThrowIfDisposed() | |||
| throw new ObjectDisposedException(null); | |||
| } | |||
| } | |||
|  | |||
| private async Task<bool> CheckLoggingEnabledAsync(BlobServiceClient blobClient, CancellationToken cancellationToken) | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see this being used anywhere?
| @wilgert Yes, the current behavior for poison blob handling of the special "webjobs-blobtrigger" control queue is for the corresponding poison queue "webjobs-blobtrigger-poison" to be created in the "target" storage account, only falling back to the "host" storage account (AzureWebJobsStorage) if queues aren't supported (code here). I suspect that a bug has been introduced here during refactorings over time. The intent was that if queues aren't supported on that storage account to fall back. Currently the code will just create a queue client for the target account and that won't fail until later when it's used. I suspect that long ago we did a runtime check here for queues and did the fallback if we couldn't connect. | 
| 
 Good to learn it is a known issue. So my assumption is right that this PR would fix our issue as well? Or is another change needed? | 
Attempt to address: #49993
In order to ensure polling and scanning operations use the "target" BlobServiceClient specified we had to make changes to how Listeners are added and the BlobListenerStrategy is initialized.
In my attempt here, I did the following:
TODO: